I. INTRODUCTION

The object of this study has been to investigate the anthropometric
measurements of students, in particular the height and weight of
college freshmen. In connection with anthropometric measurements the
statement is often made by well-known statisticians that such
measurements follow the normal probability curve; in fact, that
statement is almost an axiom in such studies. It has been my purpose
to investigate whether or not that is the case, and if not, what
curve will fit those measurements more closely.

In connection with such an investigation there are certain constants
which are always of interest; these include several averages and a
measure of the spread of the data. The three common forms of average
are the arithmetic mean, the median, and the mode, while the standard
deviation is the usual measure of the spread of the data. The
calculation of these constants was the first step in this study.

The question of the dependence of weight upon height is one that
often arises. This too has been investigated, and the usual measure
of the relationship (the coefficient of correlation) calculated.

In particular, one object has been to set down on paper a
statistical description of certain classes of students at the
University of Illinois in the years just following the war, this
description to be used later for comparison purposes.

II. DESCRIPTION OF DATA

Every year the University of Illinois requires that the students
entering that institution for the first time take a physical
examination. The information obtained from this examination, which
includes various physical measurements, is recorded on individual
cards for each person and is then filed in the offices of the
University Health Service Department (see Table I). The date of
examination and the classification and age of the student are also
recorded on this card, so that it is possible to get the heights and
weights of men and women, taken when they were freshmen. As a result
of this system, I was able to get sufficient data to separate the men
into classes according to age, so that I have the measurements of
1004 men who were 18 years old when they entered the university as
freshmen, of 991 who were 19 at the same time, 638 who were 17, and
877 who were 20. The age recorded is the one given by the student
himself and means his age on his last birthday, rather than his age
to the nearest birthday.

Because there are fewer women in the university, I was unable to get
enough individuals to divide into classes as I did the men; instead,
I have the heights and weights of 971 women of ages 16-26 (the range
of the ages of a very large proportion of the 971 women is 17-20
inclusive). This does not give as satisfactory results as one could
desire, for it leaves nothing for comparison at the present time, and
the heterogeneity of the group makes the results of less worth.

Then, as a matter of interest and comparison, I made a frequency
distribution of the measurements of the men, of ages 17 to 20
inclusive, in the present freshman class (1920-1921).

It is understood that all of the measurements recorded are
measurements of American born students.
TABLE I

[Facsimile of the individual physical examination card of the
University Health Service. The card is divided into three parts:
FAMILY HISTORY (racial extraction; relatives who have had
tuberculosis, cancer, neurasthenia, epilepsy, or other possibly
inherited disease); PERSONAL HISTORY (birthplace, injuries,
operations, age of last vaccination scar, work done, present general
health, appetite, sleep, use of tea, coffee, tobacco, alcohol, and
drugs, and a list of diseases from measles to tuberculosis with the
age at which each occurred); and PHYSICAL EXAMINATION (general
development, weight in pounds, height in inches, build, head length
and width, eye and hair color, skin, teeth, thyroid, lymph nodes,
chest, lungs, heart, blood pressure, abdomen, urine, vertebral
column, feet, nose, tonsils, ears, and eyes, with numbered spaces for
defects treated or corrected and for the examiner's signature).]
III. THE FREQUENCY DISTRIBUTIONS

When the heights and weights of these 4481 students had been recorded
according to the classifications mentioned above, I was able to make
twelve frequency distributions, that is, distributions of height and
of weight for 17 year, 18 year, 19 year, and 20 year men, for women,
and for the men of one freshman class (Tables II, III). For heights I
have used a class interval of one inch, the frequencies as given in
the tables representing the number of students whose height lies in
the interval of .5 inch on each side of the class as given; for
example, a man is recorded as 68 inches tall if he is in reality
between the limits 67.5 and 68.5 inches. The central points such as
68 are called the mid-ordinates, since all the frequencies are
assumed to be concentrated at these mid-points of the intervals of
the base. When heights were given as the limits of intervals (e.g.,
67.5), the unit was divided, with one half given to each mid-ordinate
just above and below the real value. The weights were distributed in
class intervals of 5 pounds, in order to make fewer classes to
compute, and because with intervals of 1 the distribution would have
been more or less discontinuous and by no means smooth. As it is, in
several of the distributions the frequencies do not run smoothly, but
show frequent jutting points in the frequency polygons (see Figs. 7,
8, 9, 10, 11). The multiples of 5 were taken as the mid-ordinates in
the case of weights, and when the latter were given as the limits of
intervals (e.g., 152.5), the unit was divided as in heights.
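
The tallying rule described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
function name `tally` and the sample readings are hypothetical and do
not come from the examination cards; the rule itself (a reading on a
class limit gives one half to each neighboring mid-ordinate) is the one
stated in the text.

```python
from collections import defaultdict

def tally(values, interval=1.0):
    """Tally readings into classes centred on multiples of `interval`.

    A reading that falls exactly on a class limit (halfway between two
    mid-ordinates) contributes 0.5 to each neighbouring class.
    """
    freq = defaultdict(float)
    half = interval / 2.0
    for v in values:
        lower_mid = (v // interval) * interval   # mid-ordinate at or below v
        offset = v - lower_mid
        if offset == half:                       # on a class limit: split the unit
            freq[lower_mid] += 0.5
            freq[lower_mid + interval] += 0.5
        elif offset < half:
            freq[lower_mid] += 1.0
        else:
            freq[lower_mid + interval] += 1.0
    return dict(freq)

# Hypothetical readings, chosen only to show the half-unit split.
heights = [67.5, 68.0, 68.25, 66.75]      # inches, class interval 1
weights = [152.5, 150.0, 148.0, 153.0]    # pounds, class interval 5
print(tally(heights, 1.0))
print(tally(weights, 5.0))
```

The half frequencies that appear throughout Tables II and III arise in
exactly this way.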

IV.  CALCULATION  OF  AVERAGES 

Although  frequency  distributions  help  to  make  any  series  of 
TABLE IIa

Frequency Distributions of Heights of College Freshmen

Height    Freshman Class 1920-21                Women                   18 year
  in      Actual  Grad. by  Grad. by    Actual  Grad. by  Grad. by    Actual  Grad. by
inches     Freq.    Normal   Type IV     Freq.    Normal   Type IV     Freq.    Normal
                     Curve     Curve               Curve     Curve               Curve

    55                                     1.0       0.1       0.3
    56                                               1.0       1.0
    57                                     1.0       3.0       3.0
    58                                    11.0       9.5       9.0
    59       0.5       0.2       0.3      23.0      26.0      34.0       1.5       0.2
    60       1.5       1.0       1.0      58.0      57.0      54.0       1.5       0.8
    61       4.0       4.0       4.0      94.5     101.0     101.5       3.0       3.0
    62       5.5      11.0      10.0     168.0     147.5     153.0       4.0       9.0
    63      32.0      28.0      26.0     168.0     174.5     180.0      21.5      23.0
    64      54.0      59.0      57.0     155.0     168.0     170.0      50.0      48.0
    65     107.5     104.0     104.0     147.5     131.0     138.0      88.5      87.0
    66     163.5     154.0     157.5      73.5      83.0      79.0     123.5     130.0
    67     190.5     190.0     195.5      40.5      43.0      41.0     165.5     162.0
    68     187.0     197.0     200.0      22.5      18.0      18.0     172.5     169.0
    69     175.5     171.0     169.0       5.5       6.0       7.0     160.0     147.5
    70     125.0     124.0     119.5       0.5       2.0       3.0      98.0     107.0
    71      74.0      76.0      72.0       0.5       0.4       1.0      53.5      65.0
    72      34.5      39.0      37.0                 0.1       0.3      41.0      33.0
    73      18.0      16.5      17.0       1.0                          13.5      14.0
    74       3.5       6.0       7.0                           0.1       4.5       5.0
    75       3.0       2.0       3.0                                     0.5       1.0
    76       2.0       0.4       1.0                                     1.5       0.4
    77       0.5       0.1       0.3

 Total    1182.0    1183.2    1181.1     971.0     971.1     972.1    1004.0    1003.9
TABLE IIb

Frequency Distributions of Heights of College Freshmen

Height         17 year              19 year              20 year
  in      Actual Graduated     Actual Graduated     Actual Graduated
inches     Freq. by Normal      Freq. by Normal      Freq. by Normal
                     Curve                Curve                Curve

    56                            2.0
    57
    58
    59
    60       1.0       0.5                  1.0       1.5       0.4
    61       4.0       2.0       1.0       3.0       1.5       2.0
    62       6.0       5.5       5.0       8.0       6.5       6.0
    63      13.0      14.0      26.0      21.0      11.5      15.0
    64      33.0      30.0      40.0      45.0      33.5      35.0
    65      40.5      53.0      84.5      81.0      61.0      66.0
    66      77.5      79.5     122.0     122.0     112.0     104.5
    67     109.0     100.0     158.5     155.0     145.5     137.0
    68     106.5     106.0     155.0     165.0     152.5     150.0
    69      89.5      94.0     136.0     147.0     129.5     136.5
    70      71.0      70.0     138.5     111.0      91.0     103.0
    71      51.5      44.0      71.5      70.0      75.5      65.0
    72      24.0      23.0      28.0      37.0      35.0      34.0
    73       8.5      10.0      13.0      17.0      16.0      15.0
    74       2.0       4.0       7.0       6.0       1.0       5.0
    75       1.0       1.2       3.0       2.0       0.5       2.0
    76                                               2.5       0.4
    77                                               0.5       0.1

 Total     638.0     636.7     991.0     991.0     877.0     876.8
TABLE IIIa

Frequency Distributions of Weights of College Freshmen

Freshman Class 1920-21

Women

18  year 

Weight 

Actual 

Grad,  by 

Grad. by 

Actual 

Grad. by 

Grad. by 

Actual 

Grad. by 

in 

Freq. 

Normal 

Type  VI 

Freq. 

Normal 

Type  V 

Freq. 

Norm^al 

pounds 

Curve 

Curve 

Curve 

85 

N 

5.0 

19.5 

5.5 

90 

1.0 

4.0 

0.1 

30. 5 

52. 0 

35,0 

95 

4.0 

8.0 

1.0 

55.  5 

47.0 

o6.  0 

4.0 

6, 5 

100 

10.0 

15,5 

7.0 

85.0 

65. 5 

92.0 

7.0 

12.5 

105 

15.0 

27.5 

21.0 

119.0 

84.0 

118.0 

16.0 

22.  5 

110 

56. 5 

44.0 

47.0 

ia2,5 

99.0 

127.0 

52.0 

57.6 

115 

71.5 

b5. 5 

80.0 

135,0 

108.0 

121.0 

58.0 

o6.  5 

130 

115.5 

89,0 

112.0 

98.0 

108.5 

105.0 

85.5 

78.0 

125 

155.  5 

111.5 

154.  5 

85. 5 

101.0 

65.  0 

145.5 

98.5 

IdO 

150.  5 

128.  5 

145.0 

66.0 

87.0 

6b.  0 

120.5 

114.0 

155 

128.5 

156.  5 

159.0 

58. 5 

69.0 

50.0 

126.0 

120.0 

140 

155.5 

135.0 

135.5 

42.0 

50.  0 

56 . 0 

129.5 

115.5 

145 

101.5 

119.0 

103.0 

25. 5 

54.0 

26.  0 

78.5 

102.0 

150 

bb.  0 

98.0 

81.0 

14.5 

21.0 

18.0 

60.5 

83.0 

155 

51.5 

74.0 

61.0 

14.5 

13.0 

15.0 

45.  5 

60.5 

160 

54.5 

51.5 

44.0 

5.  0 

7.0 

9.0 

58.0 

40.5 

165 

54.5 

55 . 0 

50.0 

5.0 

5.  0 

6.  0 

21.0 

25.0 

170 

20.5 

19.5 

20.5 

6.0 

1.5 

4.0 

12.6 

14.  0 

175 

11.5 

10.5 

13.5 

2.0 

1.0 

5.  0 

7.0 

7.0 

180 

7.5 

5.5 

9.0 

2.0 

0.2 

3.0 

11.0 

3.5 

185 

b.O 

2.5 

5.0 

0.0 

0.1 

1.5 

1.0 

1.5 

190 

1.0 

1.0 

5.0  , 

1.0 

1.0 

1.5 

0.5 

195 

4.  0 

0.4 

2.0 

1,0 

0.7 

6.  5 

0.2 

200 

5.  0 

0.1 

1.0 

2.0 

0.5 

205 

1.0 

0.7 

1,0 

0.  5 

210 

1.0 

215 

1.0 

255 

240 

285 

1182.0 

1178.0 

1181.8 

971.0 

950.  5 

969.  5 

1004.0 

998.2 

TABLE IIIb

Frequency Distributions of Weights of College Freshmen

Weight         17 year              19 year              20 year
  in      Actual Graduated     Actual Graduated     Actual Graduated
pounds     Freq. by Normal      Freq. by Normal      Freq. by Normal
                     Curve                Curve                Curve

    90       2.0       3.0
    95       2.0       6.0                           1.0       5.0
   100       2.5      11.0       6.5      10.0       4.0       9.0
   105      11.5      18.0       9.5      18.0       8.5      16.0
   110      20.0      27.0      31.5      31.0      23.5      26.0
   115      39.5      39.0      53.0      48.0      30.5      39.5
   120      64.0      51.0      96.5      68.0      63.0      55.5
   125      84.0      62.0      85.0      89.0      88.5      72.0
   130      82.5      69.0     117.0     106.0     122.5      87.0
   135      89.0      71.5     128.0     116.0     107.0      96.0
   140      70.5      68.5     123.5     116.0     111.0      98.5
   145      53.0      61.0     101.0     106.0      80.5      93.0
   150      48.5      49.5      64.5      89.0      55.0      81.0
   155      19.0      37.0      54.5      68.0      65.5      66.0
   160      17.0      26.0      34.5      48.0      45.0      49.0
   165       8.0      17.0      40.0      31.0      29.5      34.0
   170       8.0      10.0      21.0      18.0      15.0      21.0
   175       5.5       5.5       6.0      10.0       4.5      13.0
   180       4.5       3.0       3.0       5.0       6.5       7.0
   185       0.0       1.0      10.0       2.0       2.0       3.0
   190       2.5       0.6       2.0       1.0       5.0       2.0
   195       0.5       0.2       1.0       0.3       1.0       1.0
   200       1.0       0.1       2.5       0.1       2.0       0.3
   205                           0.5                 1.5       0.1
   210                                               2.5
   235       1.0  (column not legible)
   240       1.0  (column not legible)
   245       1.0  (column not legible)
   255       1.0  (column not legible)
   285       1.0

 Total     638.0     636.9     991.0     980.4     877.0     874.9

[The four single observations at 235, 240, 245, and 255 pounds are
actual frequencies. The column totals require two of them to belong
to the 17 year group and two to the 20 year group.]
observations comprehensible, there is a need for quantitative
expressions to characterize the distributions. Two ways in which
ordinary distributions may differ are: (1) in position, that is, in
the values of the variable around which they center, and (2) in the
range of variation. Expressions which measure position are usually
called averages, of which three common ones are the arithmetic mean,
the median, and the mode. The more closely a distribution can be fit
by the normal curve the more nearly will these three measures
approach coincidence. Measures of the range of variation are termed
measures of dispersion, of which the most useful is the standard
deviation. The values of these expressions, for the distributions in
the present study, have been calculated.

The mode of a frequency distribution is the class which has the
greatest frequency. Since the division into classes is arbitrary,
the value of the mode thus obtained can only be approximate. A more
accurate definition, as given by G. U. Yule,¹ is: "The mode is the
value of the variable corresponding to the maximum of the ideal
frequency curve which gives the closest possible fit to the actual
distribution." This theoretical mode is obtained in several cases
under the discussions of the types of frequency curves. The empirical
or approximate value of this average can be easily obtained from the
frequency polygons.

1. "An Introduction to the Theory of Statistics", p. 120.

The median is that value of the variable of an ordinary frequency
distribution, on each side of which there are an equal number of
observations. The median for each of the given distributions was
obtained by simple interpolation, which gives the value only
approximately, since such a method assumes that the values in each
class are uniformly distributed throughout the interval. Using the
figures of the freshman group to illustrate, the total number of
observations is 1182, half of which is 591. From the table of
distribution for this group (Table IIa), we see that there are 559
students whose height is not greater than 67.5 inches, and 187 more
whose height comes in the interval 67.5 to 68.5. However, only 32 are
required to make up the total of 591, so that the value of the median
is

$$67.5 + \frac{32}{187} \times 1 = 67.5 + .171 = 67.671 \text{ inches.}$$

The fraction added must be multiplied by the class interval, which in
this case is unity.
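
The interpolation just described can be checked with a short Python
sketch. The function name `grouped_median` is an assumption made for
illustration; the frequencies are the actual heights of the freshman
class as read from Table IIa (classes 59 to 77 inches), and the method
is the simple interpolation of the text, which assumes the values in
each class are spread uniformly over the interval.

```python
def grouped_median(mids, freqs, interval):
    """Median of a grouped distribution by simple linear interpolation."""
    half = sum(freqs) / 2.0
    cumulative = 0.0
    for mid, f in zip(mids, freqs):
        if f > 0 and cumulative + f >= half:
            lower_limit = mid - interval / 2.0
            # fraction of the class needed to reach half the total,
            # multiplied by the class interval
            return lower_limit + (half - cumulative) / f * interval
        cumulative += f
    raise ValueError("empty distribution")

mids = list(range(59, 78))   # mid-ordinates 59, 60, ..., 77 inches
freqs = [0.5, 1.5, 4.0, 5.5, 32.0, 54.0, 107.5, 163.5, 190.5,
         187.0, 175.5, 125.0, 74.0, 34.5, 18.0, 3.5, 3.0, 2.0, 0.5]
median_height = grouped_median(mids, freqs, 1.0)
print(round(median_height, 3))
```

For the weight distributions the same function would be called with an
interval of 5, which is the multiplication by the class interval
mentioned above.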

The arithmetic mean is the particular form of average which is
commonly termed the mean or the average value. It is defined by the
formula

$$\text{Mean} = \frac{\Sigma f X}{N}$$

in which X represents the various values of a variable, f is the
frequency of each value, and N is the total number of observations.
To calculate the arithmetic means of the various distributions, a
modification of Mr. Hardy's summation method, as given by W. P.
Elderton,¹ was used. By means of this method the moments² are
calculated, and it is these which lead to the criterion k, which
shows the type of frequency curve (as they have been classified by
Karl Pearson) to which the data belong. The summation method is shown
by Table IV, in which I calculated the moments for the distribution
of heights of the freshman group. Using a central term as the
starting point for the summation, we get $S_2$, which is the
difference between the sums on

1. Frequency Curves and Correlation, 1917 Edition, pp. 22-33.

2. For a discussion of the method of moments see Elderton, "Frequency
Curves and Correlation", Chap. III.

TABLE IV

Summation Method for the Calculation of Moments and Other Constants

Example: Distribution of Heights of 1182 Freshmen

 Frequency    1st Sum    2nd Sum    3rd Sum    4th Sum    5th Sum

        .5         .5         .5         .5         .5         .5
       1.5        2.0        2.5        3.0        3.5        4.0
       4.0        6.0        8.5       11.5       15.0       19.0
       5.5       11.5       20.0       31.5       46.5       65.5
      32.0       43.5       63.5       95.0      141.5      207.0
      54.0       97.5      161.0      256.0      397.5
     107.5      205.0      366.0      622.0
     163.5      368.5      734.5
     190.5
     187.0      623.0     1561.0     3422.5     6928.5    13285.0
     175.5      436.0      938.0     1861.5     3506.0     6356.5
     125.0      260.5      502.0      923.5     1644.5     2850.5
      74.0      135.5      241.5      421.5      721.0     1206.0
      34.5       61.5      106.0      180.0      299.5      485.0
      18.0       27.0       44.5       74.0      119.5      185.5
       3.5        9.0       17.5       29.5       45.5       66.0
       3.0        5.5        8.5       12.0       16.0       20.5
       2.0        2.5        3.0        3.5        4.0        4.5
       0.5        0.5        0.5        0.5        0.5        0.5

$$S_2 = d = \frac{1561}{1182} - \frac{734.5}{1182} = .69923858 \qquad S_3 = \frac{3422.5}{1182} + \frac{622}{1182} = 3.42174280$$

$$S_4 = \frac{6928.5}{1182} - \frac{397.5}{1182} = 5.52538071 \qquad S_5 = \frac{13285}{1182} + \frac{207}{1182} = 11.41455160$$

$$\mu_2 = 2S_3 - d(1+d) = 5.65531243$$

$$\mu_3 = 6S_4 - 3\mu_2(1+d) - d(1+d)(2+d) = 1.11594621$$

$$\mu_4 = 24S_5 - 2\mu_3[2(1+d)+1] - \mu_2[6(1+d)(2+d)-1] - d(1+d)(2+d)(3+d) = 102.28991868$$

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = .00688530 \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = 3.19830317$$

$$k = \frac{\beta_1(\beta_2+3)^2}{4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6)} = .01377189$$

$$r = \frac{6(\beta_2-\beta_1-1)}{2\beta_2-3\beta_1-6} = 34.97401766 \qquad m = \tfrac{1}{2}(r+2) = 18.48700883$$

$$\nu = \frac{r(r-2)\sqrt{\beta_1}}{\sqrt{16(r-1)-\beta_1(r-2)^2}} = (-)4.13288736, \text{ negative because } \mu_3 \text{ is positive}$$

$$a = \sqrt{\frac{\mu_2}{16}}\,\sqrt{16(r-1)-\beta_1(r-2)^2} = 13.76545077$$

each side of the central point; in this work it is an advantage to
consider the total frequency as unity, hence the sums of the columns
must be divided by N, the total number of observations, in order to
keep the work consistent. $S_2$ is d, the difference between the
actual mean of the distribution and the arbitrary starting point.
Since the starting point in the example was 190.5, which corresponds
to class 67, and since d is .699, the actual mean of the data is
67.699 inches. Using the same group (freshman class) as an
illustration of the calculation of the mean for weight, we find 150.5
taken as the central point, which corresponds to class 130; summing
just as in the case of heights, d was found equal to 1.187. However,
instead of adding this directly to the arbitrary point 130, it was
necessary first to multiply it by 5 in order to get it in terms of
the class interval, which is 5. Thus the mean weight of the men of
one freshman class is 135.9 pounds (Table V).
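
The summation arithmetic for the mean can be reproduced with a short
Python sketch. It is offered only as an illustration: the function
names are hypothetical, the frequencies are the freshman heights as
read from Table IIa, and the choice of which entry of each column to
use (the lower third sum stopping one row short of the center) is
taken from the arrangement of Table IV rather than from a general
statement of Elderton's method.

```python
from itertools import accumulate

def repeated_sums(freqs, depth):
    """Return [1st sums, 2nd sums, ...]; each column is the running
    total of the column before it."""
    columns, column = [], list(freqs)
    for _ in range(depth):
        column = list(accumulate(column))
        columns.append(column)
    return columns

def mean_and_second_moment(freqs, central_index, central_class, interval=1.0):
    """Mean and second moment about the mean, in units of the variable."""
    n = sum(freqs)
    below = freqs[:central_index]              # runs from the extreme toward the centre
    above = freqs[central_index + 1:][::-1]    # reversed so it runs the same way
    lo, up = repeated_sums(below, 3), repeated_sums(above, 3)
    s2 = (up[1][-1] - lo[1][-1]) / n           # difference of the two 2nd sums
    s3 = (up[2][-1] + lo[2][-2]) / n           # 3rd sums; lower column stops one row early
    d = s2                                     # distance of the mean from the central class
    mu2 = 2 * s3 - d * (1 + d)
    return central_class + d * interval, mu2 * interval ** 2

heights = [0.5, 1.5, 4.0, 5.5, 32.0, 54.0, 107.5, 163.5, 190.5,
           187.0, 175.5, 125.0, 74.0, 34.5, 18.0, 3.5, 3.0, 2.0, 0.5]  # classes 59..77
mean_height, mu2_height = mean_and_second_moment(heights, central_index=8,
                                                 central_class=67)
print(round(mean_height, 3), round(mu2_height, 4))
```

The value of d found this way is the same as would be obtained from the
defining formula, the sum of fX divided by N, taken about class 67;
the summation method merely replaces the multiplications by running
additions.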

The arithmetic means of the various distributions for the height of
men are very close together, varying between 67.7 and 68 inches. The
20 year group averaged the tallest at 67.98 inches, with the 19 year
next at 67.86, then the 17 year at 67.83, and the 18 year at 67.74;
the group of freshman men averaged 67.70. The range on each side of
the mean is 8 or 9 inches, with an exception in the 19 year group, in
which two men were only 56 inches. The women averaged 63.31 inches.

There was a very great range in several of the weight distributions;
for example, one of the 17 year men weighed 285 pounds, although the
distribution was by no means continuous that far. The range of
continuity was usually 90 to 210 pounds. The 20 year group averaged
not only tallest but heaviest, with an average weight of 139
TABLE V

Summation Method for the Calculation of Moments and Other Constants

Example: Distribution of Weights of 1182 Freshmen

 Frequency    1st Sum    2nd Sum    3rd Sum    4th Sum    5th Sum

       1.0        1.0        1.0        1.0        1.0        1.0
       4.0        5.0        6.0        7.0        8.0        9.0
      10.0       15.0       21.0       28.0       36.0       45.0
      15.0       30.0       51.0       79.0      115.0      160.0
      36.5       66.5      117.5      196.5      311.5      471.5
      71.5      138.0      255.5      452.0      763.5
     115.5      253.5      509.0      961.0
     153.5      407.0      916.0
     150.5
     128.5      624.5     2319.5     7622.0    23017.5    65016.5
     133.5      496.0     1695.0     5302.5    15395.5    41999.0
     101.5      362.5     1199.0     3607.5    10093.0    26603.5
      66.0      261.0      836.5     2408.5     6485.5    16510.5
      51.5      195.0      575.5     1572.0     4077.0    10025.0
      54.5      143.5      380.5      996.5     2505.0     5948.0
      34.5       89.0      237.0      616.0     1508.5     3443.0
      20.5       54.5      148.0      379.0      892.5     1934.5
      11.5       34.0       93.5      231.0      513.5     1042.0
       7.5       22.5       59.5      137.5      282.5      528.5
       6.0       15.0       37.0       78.0      145.0      246.0
       1.0        9.0       22.0       41.0       67.0      101.0
       4.0        8.0       13.0       19.0       26.0       34.0
       3.0        4.0        5.0        6.0        7.0        8.0
       1.0        1.0        1.0        1.0        1.0        1.0

$$S_2 = \frac{2319.5}{1182} - \frac{916}{1182} = 1.18739425 \qquad S_3 = \frac{7622}{1182} + \frac{961}{1182} = 7.26142132$$

$$S_4 = \frac{23017.5}{1182} - \frac{763.5}{1182} = 18.82741116 \qquad S_5 = \frac{65016.5}{1182} + \frac{471.5}{1182} = 55.40439932$$

$$\mu_2 = 11.92554329 \qquad \mu_3 = 26.42825548 \qquad \mu_4 = 523.99711354$$

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = .411815 \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = 3.684449 \qquad k = 2.552911$$

pounds; the other groups follow in good order, the 19 year averaging
137.5, the 18 year 135.4, and the youngest group weighing lightest
with 134.7 pounds as an average. The mean weight of the women (117.9)
is considerably less than those of the men, although the range is
practically the same.

V. CALCULATION OF THE STANDARD DEVIATION

To measure the spread of the data about the mean, the standard
deviation, σ, was calculated. This measure, which is usually used to
show the dispersion of a distribution, is given by the formula

$$\sigma = \sqrt{\frac{\Sigma f x^2}{n}}$$

where f is the frequency of a given interval, x is the deviation of
that interval from the mean, and n is the total number of
observations. But it, too, can be obtained by the summation method,
for its square is the same as the second moment; that is, the
standard deviation is equal to $\sqrt{\mu_2}$ (Tables IV, V). The
spread of the data from the mean was least in the distribution of
women's heights, the standard deviation being 2.30 inches; the
dispersion for the other distributions of heights was very nearly the
same for all groups: 20 year, 2.33; 18 year, 2.35; freshman class,
2.38; 17 year, 2.39; 19 year, 2.40.

To find the standard deviation for weights it must be remembered that
in order to get it in terms of pounds, the class interval of 5 must
be taken into account; thus, $5\sqrt{\mu_2}$ ($\mu_2$ is the symbol
for second moment) equals the standard deviation of the distribution
of weights. As one would expect, the deviation is quite large due to
the range, especially when thought of in terms of one pound. The
smallest standard deviation is 16.58 for the 18 year group, and the
19 year group is next with σ as 16.82. This can be seen from the
distribution of the data, for in neither case is the range very great



and in the 18 year group especially there is a noticeable clustering
about the mean. The 20 year and 17 year groups with their longer
ranges have deviations of 17.74 and 17.79 respectively. The
distribution of the weights of women is spread over quite a large
range and consequently it, too, has a high σ: 17.74.
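
As a check on the relation between the defining formula and the
summation method, the standard deviation of the freshman weights can
be computed directly. The sketch below is an illustration only: the
function name `grouped_mean_and_sd` is an assumption, and the
frequencies are those of the first column of Table V, taken to
correspond to the classes 90, 95, ..., 205 pounds with 150.5 at class
130 as stated in the text.

```python
import math

def grouped_mean_and_sd(mids, freqs):
    """Mean and standard deviation of a grouped distribution,
    sigma = sqrt(sum(f * x**2) / n) with x measured from the mean."""
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(mids, freqs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n
    return mean, math.sqrt(variance)

weight_mids = list(range(90, 210, 5))   # 90, 95, ..., 205 pounds
weight_freqs = [1.0, 4.0, 10.0, 15.0, 36.5, 71.5, 115.5, 153.5, 150.5,
                128.5, 133.5, 101.5, 66.0, 51.5, 54.5, 34.5, 20.5, 11.5,
                7.5, 6.0, 1.0, 4.0, 3.0, 1.0]
mean_weight, sd_weight = grouped_mean_and_sd(weight_mids, weight_freqs)

# The same quantity from the second moment of Table V, in class units of 5 pounds.
sd_from_moment = 5 * math.sqrt(11.92554329)
print(round(mean_weight, 1), round(sd_weight, 2), round(sd_from_moment, 2))
```

The two routes agree because the second moment in Table V is expressed
in class intervals, so that its square root must be multiplied by 5 to
be read in pounds.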

VI. EXPLANATION OF k

After the moments are calculated, k, the criterion for Pearson's
frequency curves, is immediately calculable. As is well known, the
normal curve does not fit all types of data; in fact, it actually
fits very few. In many physical, economic, and biologic
investigations, we find the tendency to deviation on one side of the
mean unequal to the tendency on the other side. It becomes necessary,
therefore, to find equations for curves which would fit these cases,
and Karl Pearson met this need in his articles on "Skew Variations in
Homogeneous Material",¹ "Supplement to a Memoir on Skew Variation",²
and "Second Supplement to a Memoir on Skew Variation".³ In the first
article he deduces a generalized probability curve which fits
"surprisingly accurately observations of an asymmetrical character."
Then he goes further in order to meet the problem of limited range;
he says, "We have, then, reached this point: that to deal effectively
with statistics we require generalized probability curves which
include the factors of skewness and range. The generalized curve we
have already reached, possesses skewness, but its range is limited in
one direction only."

1. Phil. Trans. of Royal Soc. of London, (A), Vol. 186, Pt. I, p. 344, et seq.
2. Phil. Trans. Royal Soc. London, (A), Vol. 197, p. 245, et seq.
3. Phil. Trans. Royal Soc. London, (A), Vol. 216, p. 429, et seq.

As we know, the normal probability curve is obtained
approximately by running a continuous curve through the point
binomial $(\tfrac{1}{2} + \tfrac{1}{2})^n$. By allowing n to increase
indefinitely and letting the interval between ordinates approach zero
the limiting curve is the theoretical probability curve. We usually
get this binomial expansion by tossing coins, for we have the
probability of throwing heads equal to the probability of throwing
tails, that is, the probability of the success and of the failure of
an event is one half. The generalized form of the probability curve
may be obtained from the general binomial $(p+q)^n$, where
$p+q=1$, but $p \neq q$. The example usually used for this form is
the throwing of dice, where the probability of throwing an ace, for
instance, is one sixth, and the probability of not throwing it is
five sixths. The normal curve may be considered as a special case of
this generalized form. The still more general form worked out by
Pearson includes both the normal probability and the generalized
probability curves as special cases. To illustrate how a frequency
curve with limited range, and skewness, may arise, he uses the
example of drawing balls from a bag. "Take n balls in a bag, of which
pn are black, and qn are white, and let r balls be drawn and the
number of black be recorded. If r > pn, the range of black balls will
lie between 0 and pn; the resulting frequency polygon will be skew
and limited in range." This polygon is given by a hypergeometric
series from which he obtains the differential equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{a+x}{b+cx+dx^2}.$$

In order to get this in the form y = f(x), it is necessary to
integrate

$$\frac{a+x}{b+cx+dx^2},$$

and the form the integral takes depends upon the values of the
coefficients of x in the denominator. Thus the criterion for the form
in a particular case is the same as that for the nature of the roots,
namely, the discriminant $c^2 - 4bd$, which Pearson puts in the form
$\frac{c^2}{4bd} - 1$, allowing $\frac{c^2}{4bd}$ to equal k. Then,
according as k = 0, k > 1, or k < 1, certain equations result and fit
various cases of skewness and range.

If the differential equation is integrated with respect to x, after
multiplying each side by $x^n$, according to the method used by
Elderton, the coefficients of the powers of x can be found in terms
of the moments. Letting $\beta_1 = \mu_3^2/\mu_2^3$ and
$\beta_2 = \mu_4/\mu_2^2$, k can be expressed in terms of the
$\beta$'s as given in Table IV. We have then, the following table:


k < 0                       Type I
k = 0, $\beta_2 > 3$        Type VII
k = 0, $\beta_2 < 3$        Type II
0 < k < 1                   Type IV
k = 1                       Type V
k > 1                       Type VI
k = $\infty$                Type III
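The classification above can be sketched in a few lines of Python. The thesis takes the expression for k from its Table IV, which is not reproduced in this section, so the formula used below is an assumption: it is the usual form of Pearson's criterion, $k = \beta_1(\beta_2+3)^2 / [4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6)]$. The function names and the tolerance are hypothetical, and the "Normal" branch (k = 0 with $\beta_2 = 3$) is added as the usual limiting case rather than taken from the table.

```python
import math


def pearson_criterion(beta1, beta2):
    """Pearson's criterion k from beta1 and beta2 (assumed standard formula)."""
    if beta1 == 0.0:
        return 0.0  # symmetric curves: k = 0
    denom = 4.0 * (4.0 * beta2 - 3.0 * beta1) * (2.0 * beta2 - 3.0 * beta1 - 6.0)
    if denom == 0.0:
        return math.inf  # k infinite corresponds to Type III
    return beta1 * (beta2 + 3.0) ** 2 / denom


def classify(beta1, beta2, tol=1e-9):
    """Return the curve type suggested by k, following the table above."""
    k = pearson_criterion(beta1, beta2)
    if math.isinf(k):
        return "Type III"
    if abs(k) <= tol:
        if abs(beta2 - 3.0) <= tol:
            return "Normal"  # limiting case, not listed in the table
        return "Type VII" if beta2 > 3.0 else "Type II"
    if k < 0.0:
        return "Type I"
    if abs(k - 1.0) <= tol:
        return "Type V"
    return "Type IV" if k < 1.0 else "Type VI"
```

In practice k computed from data is never exactly 0, 1 or infinite, so the transition types (II, III, V, VII) are chosen only when k lies close to those values; how close is a matter of judgment that the tolerance here does not settle.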

VII. CLASSIFICATION OF DISTRIBUTIONS ACCORDING TO TYPES
Using k, then, according to the above table, I found that the
distributions of heights always approached the normal curve very
nearly, in fact to all appearances as nearly as any other curve,
although according to k and $\beta_2$ the distributions of the
[illegible] year men, the women, and the freshman class groups were
fit more closely by Type IV. The 17 year group with its negative k is
classed with Type I. To compare the fit of the two curves (the curve
of Type IV and the normal curve) I calculated the equations of the
two, for the heights of women and of the freshman class, and plotted
them, following the method of Elderton, pp 69-73. The calculation of
the constants for the equation of Type IV,

$$y = y_0\left(1+\frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)},$$

is shown in Tables IV and VII, as worked out for the freshman group.
From the graphs of the two curves for both the freshmen and the women,
it seems that the normal curve is as close a fit as, if not a better
fit than, that of Type IV (Figs. 1 and 6). However, it may be that the
eye can not judge accurately from figures the closeness of fit, and it
may be that k has a relatively large probable error such that the
normal curve is really the better fit.

The types represented by the distribution of weights are three:
Types IV, V and VI. The 17 year, 18 year, and 20 year men are
classified under Type IV, while the 19 year and freshman groups are
represented by Type VI, and the weights of women are fit best by a
curve of Type V. From the figures (7-12) showing the distribution of
weights, it is easily seen that they are not fit by the normal curve.
In two cases I have drawn the curves which from k should better
represent the data, and from Figures 7 and 12 we see that they do.
The constants of the equation of Type V:

$$y = y_0\, x^{-p}\, e^{-\gamma/x}$$

were calculated and the ordinates obtained as shown in Tables XI and
XII.¹ The equation of Type VI is

$$y = y_0\,(x-a)^{q_2}\, x^{-q_1}$$

and the calculation of these constants and the curve is shown in
Tables IX and X.²

1. See Elderton, pp 78-80.

2. See Elderton, pp 80-85.

For each of the six groups, the frequency polygon of both height and
weight was drawn, and on the polygon was graphed the normal curve for
the respective data. To get the ordinates for the normal curve I used
Sheppard's tables, which require that the deviations be expressed in
terms of the standard deviation of the given data, that is, as
$x/\sigma$. The z which is given in the tables is equal to
$\frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\frac{x^2}{\sigma^2}}$ for
corresponding $x/\sigma$'s. Since the equation for the normal curve is
$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}}$, it is
necessary to multiply the z given in the tables by $N/\sigma$ in order
to have the proper ordinates for the given distribution. An example is
given by Table VI. Since for the normal curve the origin is at the
mode, and the mode and the mean are the same, we have
$y_0\ (= \frac{N}{\sigma\sqrt{2\pi}})$ as the maximum and central
ordinate, corresponding to the abscissa represented by the mean.
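The rule just described (look up z for each $x/\sigma$, then multiply by $N/\sigma$) can be sketched as follows. The constants N = 1004, mean = 67.73556 and $\sigma$ = 2.3493 are those quoted in Table VI for the 18 year men; the function names are hypothetical, and the tabulated z of Sheppard's tables is replaced here by a direct evaluation of the unit normal ordinate.

```python
import math

N = 1004          # 18 year freshmen men (Table VI)
MEAN = 67.73556   # mean height in inches (Table VI)
SIGMA = 2.3493    # standard deviation in inches (Table VI)


def z(t):
    """Ordinate of the unit normal curve at t = x / sigma."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_ordinate(height):
    """Ordinate of the fitted normal curve at a given height."""
    t = (height - MEAN) / SIGMA
    return (N / SIGMA) * z(t)


# Maximum ordinate y0 = N / (sigma * sqrt(2 pi)), taken at the mean.
y0 = normal_ordinate(MEAN)

# Ordinates at the whole-inch heights used in Table VI.
ordinates = {h: normal_ordinate(h) for h in range(59, 77)}
```

With these constants `y0` comes out close to the 170.49 quoted in Table VI, and the ordinate at 66 inches is close to the tabulated 129.78.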


VIII. CALCULATION OF CURVE OF TYPE IV


It was shown above that the distribution of heights of the freshman
group was among those which were better fit by Type IV. The equation
of this curve is

$$y = y_0\left(1+\frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)}$$

and in order to graph it, it is necessary to evaluate the constants
a, m, $\nu$, and $y_0$. These were calculated by means of the formulae
on p 69 of Elderton. Each of these constants can be expressed in terms
of another constant r, the value of which in terms of $\beta_1$ and
$\beta_2$ is derived in Elderton (pp 74-76). The calculation is simple
until the integral $G(r,\nu)$ is met in the formula

$$y_0 = \frac{N}{a}\,\frac{e^{\nu\pi/2}}{G(r,\nu)}. \qquad (a)$$

In deriving the value of $y_0$ (See Elderton, p 74), the integral
TABLE VI

Calculation of the Normal Curve
Example: Distribution of Heights of 1004 Freshmen Men, Age 18.

$$y = y_0\, e^{-\frac{x^2}{2\sigma^2}}, \qquad y_0 = \frac{N}{\sigma\sqrt{2\pi}} = 170.4923$$

Origin = Mean = 67.73556
σ   = 2.3493
1/σ = .42566
N/σ = 427.3613

Height   Frequency      x/σ          z        (N/σ)·z    Normal curve y
  59         1.5      3.71838    .0003968       0.170         0.2
  60         1.5      3.29272    .0017646       0.754         0.8
  61         3.0      2.86706    .0065461       2.798         3.0
  62         4.0      2.44140    .0202597       8.658         9.0
  63        21.5      2.01574    .0523132      22.357        22.0
  64        50.0      1.59008    .1126899      48.16         48.0
  65        88.5      1.16442    .2025293      86.55         87.0
  66       123.5      0.73876    .3036668     129.78        130.0
  67       165.5      0.31310    .3798557     162.34        162.0
  68       172.5      0.11256    .3964193     169.41        169.0
  69       160.0      0.53822    .3451472     147.50        147.5
  70        98.0      0.96388    .2507066     107.14        107.0
  71        53.5      1.38954    .1519282      64.93         65.0
  72    [illegible]   1.81520    .0768127      32.83         33.0
  73        13.5      2.24086    .0323983      13.80         14.0
  74         4.5      2.66652    .0114012       4.87          5.0
  75         0.5      3.09218    .0033471       1.43          1.0
  76         1.5      3.51784    .0008198       0.35          0.4

Total                                        1003.837      1003.9
$\phi$ desired; the difference between $\Delta u_0$ and its succeeding
value as given in the table. The value of log $H(r,\nu)$ known,
$F(r,\nu)$ was obtained from equation (c) and substituted in the
equation

$$y_0 = \frac{N}{a\,F(r,\nu)}$$

obtained from equations (a) and (b).

After the constants had been evaluated, Table VIII was made and the
ordinates calculated. The origin for Type IV is equal to the mean
$+\ \frac{\nu a}{r}$ and for the freshman group is 66.07. The mode is
the mean $-\ \frac{1}{2}\frac{\mu_3}{\mu_2}\frac{r-2}{r+2}$, which in
the example cited is 67.6, and the corresponding ordinate 202.77.

IX. CALCULATION OF THE CURVE OF TYPE VI
The distribution of the weights of the freshman group was found to
approach most nearly the curve represented by the equation of
Type VI:

$$y = y_0\,(x-a)^{q_2}\, x^{-q_1}.$$

The three constants, a, $q_2$, $q_1$ can be expressed, too, in terms
of $\beta_1$ and r, which is derived in the discussion of Type I,
pages 59-60, of Elderton. However, in getting the value for $y_0$, it
is necessary to evaluate the so-called $\Gamma$ functions. In deriving
the formula for $y_0$, the integral
$\int_0^1 x^{p-1}(1-x)^{q-1}\,dx$ appears, and is another one which
can not be expressed in terms of elementary functions. This integral
is called the B function and is represented by the symbols
$\mathrm{B}(p,q)$. There is another so-called non-integrable
expression: $\int_0^\infty x^{p-1} e^{-x}\,dx$, which is termed the
$\Gamma$ function of (p), and it has been found that the B function
can be expressed in terms of the $\Gamma$ function. This relation is
expressed by the following equation:¹

$$\mathrm{B}(p,q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}$$


1. Elderton, Appendix II, p 152-156.
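The relation between the B and $\Gamma$ functions can be checked numerically. The following is a minimal sketch, not part of the original calculation: it estimates the B integral by the midpoint rule and compares it with the $\Gamma$ expression evaluated by Python's `math.gamma`. The function names, the step count and the sample arguments p = 2.5, q = 3.5 are all arbitrary choices for illustration.

```python
import math


def beta_numeric(p, q, steps=20000):
    """Midpoint-rule estimate of the integral of x^(p-1) (1-x)^(q-1) on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** (p - 1.0) * (1.0 - x) ** (q - 1.0)
    return total * h


def beta_from_gamma(p, q):
    """B(p, q) expressed through the Gamma function."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


numeric = beta_numeric(2.5, 3.5)
closed_form = beta_from_gamma(2.5, 3.5)
```

The two values agree to several decimal places, which is the practical point of the relation: once $\Gamma$ is tabulated, no separate table of B is needed.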
TABLE VII

Calculation of $y_0$ for Type IV
Example: Distribution of Heights of 1182 Freshmen.

$$y_0 = \frac{N}{a}\,\frac{e^{\nu\pi/2}}{G(r,\nu)}, \qquad e^{-\nu\pi/2}\,G(r,\nu) = F(r,\nu)$$

$$F(r,\nu) = \frac{e^{\nu\phi}\cos^{r+1}\phi}{\sqrt{r-1}}\,H(r,\nu)$$

N = 1182              a = 13.76545
r = 34.97402          ν = -4.13389

tan φ = ν/r = -.118170

φ = 6°.74738 = -.1176256 (circular measure)

For log H(r,ν):

               r = 34        r = 35
φ = 6°       .3894611      .3897389
φ = 7°       .3894778      .3897551

φ = 6°.74738, r = 34
log H(r,ν) = .3894611 + (.74738)[167] - ½(.74738)(.25262)[25]
           = .3894733

φ = 6°.74738, r = 35
log H(r,ν) = .3897389 + (.74738)[162] - ½(.74738)(.25262)[2?]
           = .3897508

φ = 6°.74738, r = 36
log H(r,ν) = .3900011 + (.74738)[158] - ½(.74738)(.25262)[24]
           = .3900127

Hence φ = 6°.74738, r = 34.97402
log H(r,ν) = .3894733 + (.97402)[2775] - ½(.97402)(.02598)[-156]
           = .3897438

log F(r,ν) = ν φ log e + (r+1) log cos φ - ½ log(r-1) + log H(r,ν)
           = -.27303500

$$y_0 = \frac{N}{a}\,\frac{1}{F(r,\nu)} = 161.013$$

log $y_0$ = 2.20686204
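The interpolation steps of Table VII appear to follow the rule $u_x = u_0 + x\,\Delta u_0 - \tfrac{1}{2}x(1-x)\,\Delta^2 u$, with the bracketed differences given in units of the seventh decimal place. That reading is inferred from the printed figures, not stated in the text, so the sketch below is an assumption; the function name is hypothetical. The step for r = 35 is skipped because its second difference is not legible, and the printed result .3897508 is relied on only through the printed differences [2775] and [-156].

```python
def interpolate(u0, first_diff, second_diff, x, scale=1e-7):
    """u0 + x*d1 - (1/2) x (1-x) d2, with differences in units of `scale`."""
    return (u0
            + x * first_diff * scale
            - 0.5 * x * (1.0 - x) * second_diff * scale)


# Interpolation in phi (fraction .74738 of a degree) at r = 34 and r = 36.
log_h_r34 = interpolate(0.3894611, 167, 25, 0.74738)
log_h_r36 = interpolate(0.3900011, 158, 24, 0.74738)

# Interpolation in r (fraction .97402) between the r = 34 and r = 35 results.
log_h = interpolate(0.3894733, 2775, -156, 0.97402)
```

Rounded to seven places these reproduce the .3894733, .3900127 and .3897438 shown in the table.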


TABLE VIII

Calculation of the Curve for Type IV


Example: Distribution of Heights of 1182 Freshmen.

$$y = y_0\left(1+\frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)}$$

Origin = Mean + νa/r = 67.69925858 - 1.62666655 = 66.07257223

Mode = Mean - ½(μ3/μ2)(r-2)/(r+2) = 67.61124887

  x/a     1+x²/a²   log(1+x²/a²)  tan⁻¹(x/a)   circ. meas.  -ν log e × circ.  -m × log(1+x²/a²)   log y      y
-.5138    1.2640       .1017      27°11'37"     -.4746         -.8519            -1.8809         1̄.4741     0.3
-.4412    1.1946       .0772      23°48'17"     -.4155         -.7457            -1.4277          .0335     1.0
-.3686    1.1358       .0553      20°13'44"     -.3531         -.6339            -1.0228          .5502     4.0
-.2959    1.0875       .0364      16°28'51"     -.2876         -.5163             -.6737         1.0169    10.0
-.2232    1.0498       .0211      12°34'58"     -.2196         -.3942             -.3904         1.4226    26.0
-.1506    1.0227       .0097       8°33'44"     -.1494         -.2683             -.1800         1.7587    57.0
-.0779    1.0061       .0026       4°27'20"     -.0778         -.1396             -.0486         2.0187   104.0
-.0053    1.0000       .0000       0°18'7"      -.0053         -.0095             -.0002         2.1971   157.5
 .0674    1.0045       .0020       3°51'15"      .0673          .1207             -.0364         2.2912   195.5
 .1400    1.0196       .0084       7°58'15"      .1391          .2497             -.1559         2.3006   200.0
 .2127    1.0452       .0192      12°0'22"       .2095          .3761             -.3552         2.2278   169.0
 .2853    1.0814       .0340      15°55'26"      .2779          .4988             -.6283         2.0774   119.5
 .3580    1.1281       .0524      19°41'43"      .3437          .6170             -.9680         1.8558    72.0
 .4306    1.1854       .0739      23°17'48"      .4066          .7298            -1.3655         1.5711    37.0
 .5033    1.2533       .0980      26°42'50"      .4662          .8369            -1.8125         1.2312    17.0
 .5759    1.3317       .1244      29°56'14"      .5225          .9378            -2.2996          .8451     7.0
 .6485    1.4206       .1525      32°57'54"      .5753         1.0327            -2.8187          .4208     3.0
 .7212    1.5201       .1819      35°47'54"      .6248         1.1214            -3.3623         1̄.9660     1.0
 .7938    1.6302       .2122      38°26'37"      .6710         1.2043            -3.9236         1̄.4876     0.3

Total                                                                                                    1181.1

Mode
 .1118    1.0125       .0054       6°22'41"      .1113          .1998             -.0997         2.3070   202.77

An eight-place logarithmic table was used in the actual calculation
of this table.

The values in circular measure were obtained to seven places.

These values were worked out to five places.
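The ordinates of the Type IV curve can be recomputed directly from the constants of Table VII. This is a sketch under one assumption not stated in this section: that m is related to r by m = (r + 2)/2, as in Elderton's treatment of Type IV. The function name is hypothetical; the remaining constants are those printed in Tables VII and VIII.

```python
import math

Y0 = 161.013             # Table VII
A = 13.76545             # Table VII
NU = -4.13389            # Table VII
R = 34.97402             # Table VII
M = (R + 2.0) / 2.0      # assumed relation between m and r
ORIGIN = 66.07257223     # Table VIII
MODE = 67.61124887       # Table VIII


def type_iv_ordinate(height):
    """Ordinate of the fitted Type IV curve at a height in inches."""
    t = (height - ORIGIN) / A
    return Y0 * (1.0 + t * t) ** (-M) * math.exp(-NU * math.atan(t))


y_at_origin = type_iv_ordinate(ORIGIN)   # equals y0 by construction
y_at_mode = type_iv_ordinate(MODE)
y_at_68 = type_iv_ordinate(68.0)
```

The ordinate at the mode comes out near the 202.77 given in the text, and the ordinate at 68 inches near the 200.0 of Table VIII; small differences are to be expected because the table was worked with logarithms carried to a fixed number of places.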
The reason for expressing in terms of $\Gamma$ rather than B functions
is that tables of $\Gamma$ functions have been made. Due to the
characteristic property of $\Gamma$ functions, namely,
$\Gamma(p+1) = p\,\Gamma(p)$, it is necessary to tabulate only those
values of the function from 1 to 2 since all others can be reduced to
a value in that interval. However, for very large values of p, there
is a close approximation¹ of the form:

$$\log\Gamma(x+1) = \log\sqrt{2\pi} + \left(x+\tfrac{1}{2}\right)\log x - \left(x-\frac{1}{12x}\right)\log e$$

This approximation was used in evaluating $y_0$ (Table IX).
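As a check on the approximation, the following sketch evaluates it in common logarithms and compares it with Python's `math.lgamma`. The argument 117.592096 is the one worked in Table IX; the function name is hypothetical.

```python
import math


def log10_gamma_approx(x_plus_1):
    """Common logarithm of Gamma(x + 1) by the approximation quoted above."""
    x = x_plus_1 - 1.0
    return (math.log10(math.sqrt(2.0 * math.pi))
            + (x + 0.5) * math.log10(x)
            - (x - 1.0 / (12.0 * x)) * math.log10(math.e))


approx = log10_gamma_approx(117.592096)

# Reference value from the standard library (natural log converted to base 10).
reference = math.lgamma(117.592096) / math.log(10.0)
```

For an argument of this size the approximation and the library value agree far beyond the eight places carried in the thesis, and both are close to the 191.7547 printed in Table IX.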

After the value of log $y_0$ was obtained, a table for the calculation
of the curve was made, and the ordinates calculated (Table X). The
origin of this type is equal to the mean
$-\ \frac{a(q_1-1)}{q_1-q_2-2}$; instead of using the mean expressed
in pounds, it is necessary to use it in terms of unit intervals, which
in the example is 27.187; this puts the origin at -71.6. The mode is
equal to the mean $-\ \frac{1}{2}\frac{\mu_3}{\mu_2}\frac{r+2}{r-2}$,
which in this case equals 26.12. This, in terms of pounds, is 130.6,
and the corresponding ordinate is 143.5.


X. CALCULATION OF THE CURVE OF TYPE V
The curve of Type V, expressed by the equation

$$y = y_0\, x^{-p}\, e^{-\gamma/x}$$

proved to fit very closely the distribution of the weights of women
(See Fig. 12). The formulae for the three constants $y_0$, p and
$\gamma$ are given in Elderton (p 78) and the proof for these formulae
on page 82. The $\Gamma$ function is again met in the calculation of
$y_0$ (for explanation of the $\Gamma$ function see the discussion of
the Curve of Type VI). In the example given (Table XI) this function
was easily

1. Proof of this approximation is given in Elderton, p 15b.
TABLE IX

Calculation of Constants for Type VI
Example: Distribution of Weights of 1182 Freshmen

$$y = y_0\,(x-a)^{q_2}\, x^{-q_1}$$

$$r = \frac{6(\beta_2-\beta_1-1)}{6+3\beta_1-2\beta_2}$$

$q_2$ and $-q_1$ are given by [illegible]:

 q2 =   13.415270
-q1 = -117.592096

Using the approximation:

log Γ(117.592096) = .39908995 + (117.092096)(2.06666911)
                    - (116.59138126)(.43429448)
                  = 191.75471442

log Γ(q1-q2-1) = log Γ(103.176826)
               = 162.33854039

log Γ(q2+1) = log Γ(14.41527)
            = 10.26646036


log y0 = 222.11534746
TABLE X

Calculation of the Curve of Type VI


Example: Distribution of Weights of 1182 Freshmen

$$y = y_0\,(x-a)^{q_2}\, x^{-q_1}$$

Origin = Mean - a(q1-1)/(q1-q2-2) = 27.187394 - 98.786951 = -71.599557

Mode = Mean - ½(μ3/μ2)(r+2)/(r-2) = 26.121887

 X        x        log x*   log(x-a)   -q1 log x   q2 log(x-a)   log y       y        y
18     89.5996    1.9523     .4809    -229.5758      6.4519     2̄.9914      .098      0.1
19     90.5996    1.9571     .6049    -230.1425      8.1152      .0880     1.225      1.0
20     91.5996    1.9619     .7013    -230.7031      9.4076      .8198     6.605      7.0
21     92.5996    1.9666     .7801    -231.2577     10.4648     1.3224    21.010     21.0
22     93.5996    1.9713     .8467    -231.8062     11.3592     1.6683    46.594     47.0
23     94.5996    1.9759     .9045    -232.3489     12.1344     1.9008    79.591     80.0
24     95.5996    1.9805     .9555    -232.8860     12.8185     2.0479   111.662    112.0
25     96.5996    1.9850    1.0011    -233.4174     13.4307     2.1286   134.477    134.5
26     97.5996    1.9894    1.0424    -233.9434     13.9846     2.1565   143.404    143.0
27     98.5996    1.9939    1.0801    -234.4639     14.4903     2.1417   138.596    139.0
28     99.5996    1.9983    1.1148    -234.9792     14.9557     2.0918   123.544    123.5
29    100.5996    2.0026    1.1469    -235.4895     15.3866     2.0125   102.924    103.0
30    101.5996    2.0069    1.1769    -235.9946     15.7879     1.9085    81.015     81.0
31    102.5996    2.0111    1.2048    -236.4948     16.1632     1.7838    60.788     61.0
32    103.5996    2.0154    1.2311    -236.9902     16.5159     1.6410    43.757     44.0
33    104.5996    2.0195    1.2559    -237.4808     16.8484     1.4829    30.407     30.0
34    105.5996    2.0237    1.2794    -237.9667     17.1629     1.3116    20.494     20.5
35    106.5996    2.0278    1.3016    -238.4480     17.4614     1.1287    13.452     13.5
36    107.5996    2.0318    1.3228    -238.9248     17.7453      .9358     8.626      9.0
37    108.5996    2.0358    1.3429    -239.3973     18.0160      .7340     5.420      5.0
38    109.5996    2.0398    1.3622    -239.8654     18.2746      .5245     3.346      3.0
39    110.5996    2.0438    1.3807    -240.3292     18.5223      .3084     2.035      2.0
40    111.5996    2.0477    1.3984    -240.7889     18.7599      .0863     1.220      1.0
41    112.5996    2.0515    1.4154    -241.2445     18.9882     1̄.8589      .723      0.7

Total                                                                               1181.8

Mode   97.7214    1.9900    1.0472    -234.0071     14.0486     2.1568   143.506

* Six decimal places were used in the original calculation.
evaluated both by the approximation method used for Type VI and by
making use of the characteristic property of the $\Gamma$ function and
the table of log $\Gamma(p)$. The latter method is shown in the table.

For this type the origin is expressed as mean $-\ \frac{\gamma}{p-2}$,
and using unit intervals, the mean for the example is 23.59 and the
origin at 10.003. The mode is equal to the mean
$-\ \frac{2\gamma}{p(p-2)}$, which in the example is 22.05, and its
corresponding ordinate is 126.9 (Table XII).

XI. MEANING OF CORRELATION

Without going into technicalities, the meaning of correlation is
easily explained. If we have two series of paired numbers, e.g.,
heights and arm lengths of each individual of a group, or prices of
flour and of cotton on certain dates, or marks in two school subjects
of individual pupils of a school, we are interested in the connection,
if any, between the two sets of figures. There may be little or no
connection between two such sets of numbers, as in the case of ages
and heights of a group of adults. Again, there may be a very close
relation, as would be shown in the case where the pairs of numbers are
the radii and corresponding circumferences of a set of circles. It is
clear that the set of pairs of numbers representing height and arm
length is different from these two extreme cases. We express the
connection between arm length and height roughly by saying that in
general a tall man has a long arm, but given the height of a man we
cannot compute the length of his arm. We need something besides the
words "in general" to tell us how the two sets of numbers are related
or correlated. We seek then some number which will measure for us the
degree of this relationship and of this correlation between paired
numbers. Many measures have been devised but the one in widest use is
the measure often called Pearson's coefficient
TABLE XI

Calculation of Constants for Type V
Example: Distribution of Weights of 971 College Women.

$$y = y_0\, x^{-p}\, e^{-\gamma/x}$$

$$p = 4 + \frac{8 + 4\sqrt{4+\beta_1}}{\beta_1} = 17.652903$$

$$\gamma = (p-2)\sqrt{\mu_2\,(p-3)} = 212.601063$$

The sign of $\gamma$ is the same as that of $\mu_3$.

$$y_0 = \frac{N\,\gamma^{\,p-1}}{\Gamma(p-1)}$$

Since Γ(p) = (p-1)Γ(p-1),

log Γ(16.6529) = log 15.6529   = 1.19459481
               + log 14.6529   = 1.16592359
               + log 13.6529   = 1.13522491
               + log 12.6529   = 1.10219008
               + log 11.6529   = 1.06643402
               + log 10.6529   = 1.02746785
               + log  9.6529   =  .98465781
               + log  8.6529   =  .93716168
               + log  7.6529   =  .88382604
               + log  6.6529   =  .82301100
               + log  5.6529   =  .75227130
               + log  4.6529   =  .66772372
               + log  3.6529   =  .56263778
               + log  2.6529   =  .42372088
               + log  1.6529   =  .21824658
               + log Γ(1.6529) = 1̄.9545121

               Sum             = 12.89960415


log y0 = 28.848329
TABLE XII

Calculation of the Curve of Type V

Example: Distribution of Weights of 971 College Women.

$$y = y_0\, x^{-p}\, e^{-\gamma/x}$$

Origin = Mean - γ/(p-2) = 23.585479 - 13.582213 = 10.003266

Mode = Mean - 2γ/(p(p-2)) = 22.046671

 X        x          log x        -p log x     (1/x)(-γ log e)    log y         y         y
17     7.003266    .84530062    -14.922010      -13.184058        .742261      5.5241     5.5
18     8.003266    .90326725    -15.945289      -11.536724       1.366316     23.244     23.0
19     9.003266    .95440009    -16.847932      -10.255330       1.745067     55.599     56.0
20    10.003266   1.00014181    -17.655406       -9.230132       1.962791     91.789     92.0
21    11.003266   1.04152161    -18.385880       -8.391278       2.071171    117.807    118.0
22    12.003266   1.07929943    -19.052768       -7.692195       2.103366    126.872    127.0
23    13.003266   1.11405244    -19.666260       -7.100637       2.081432    120.624    121.0
24    14.003266   1.14622934    -20.234275       -6.593567       2.020487    104.831    105.0
25    15.003266   1.17618581    -20.763094       -6.154091       1.931144     85.338     85.0
26    16.003266   1.20420862    -21.257778       -5.769539       1.821012     66.224     66.0
27    17.003266   1.23053235    -21.722469       -5.430214       1.695646     49.619     50.0
28    18.003266   1.25535130    -22.160595       -5.128595       1.559139     36.236     36.0
29    19.003266   1.27882824    -22.575031       -4.858716       1.414582     25.977     26.0
30    20.003266   1.30110090    -22.968208       -4.615820       1.264301     18.378     18.0
31    21.003266   1.32228683    -23.342201       -4.396053       1.110075     12.885     13.0
32    22.003266   1.34248715    -23.698795       -4.196262        .953272      8.979      9.0
33    23.003266   1.36178950    -24.039538       -4.013842        .794949      6.236      6.0
34    24.003266   1.38027034    -24.365778       -3.846621        .635930      4.324      4.0
35    25.003266   1.39799674    -24.678701       -3.692776        .476852      2.998      3.0
36    26.003266   1.41402790    -24.961697       -3.550764        .335868      2.167      2.0
37    27.003266   1.43141630    -25.268653       -3.419270        .160406      1.446      1.5
38    28.003266   1.44720869    -25.547435       -3.297168        .003726      1.009      1.0
39    29.003266   1.46244690    -25.816433       -3.183485       9.848411*      .705      1.0
40    30.003266   1.47716853    -26.076313       -3.077381       9.694635*      .495      0.5
41    31.003266   1.49140745    -26.327671       -2.978121       9.542537*      .348   [illegible]

Total                                                                                    969.6

Mode  12.043405   1.08074929    -19.078362       -7.666558       2.103409    126.885

* -10
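The Type V ordinates can be recomputed from the constants of Table XI by working in logarithms, as the table does. This is a minimal sketch; the function name is hypothetical, and the constants log $y_0$ = 28.848329, p = 17.652903 and $\gamma$ = 212.601063 are those printed above. It also uses the fact that $x^{-p}e^{-\gamma/x}$ has its maximum at $x = \gamma/p$, which gives an independent check on the tabulated mode.

```python
import math

LOG10_Y0 = 28.848329   # Table XI
P = 17.652903          # Table XI
GAMMA = 212.601063     # Table XI


def type_v_ordinate(x):
    """Ordinate of y = y0 * x^(-p) * exp(-gamma / x), x measured from the origin."""
    log10_y = (LOG10_Y0
               - P * math.log10(x)
               - (GAMMA / x) * math.log10(math.e))
    return 10.0 ** log10_y


# Setting the derivative of log y to zero gives the mode at x = gamma / p.
mode_x = GAMMA / P
y_at_mode = type_v_ordinate(mode_x)
y_at_x12 = type_v_ordinate(12.003266)   # the row X = 22 of Table XII
```

`mode_x` agrees with the 12.043405 of the "Mode" row, and the two ordinates come out near the tabulated 126.885 and 126.872.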
of correlation, or the product-moment coefficient, or simply the
correlation coefficient. It is universally represented by the letter
r.

The correlation coefficient varies from -1 to +1. For the extremes of
the range the correlation is perfect. In perfectly correlated series,
given one of any pair of numbers, we can find the other by solving a
linear algebraic equation. An example of this perfect correlation is
that of the radii and circumferences of circles, in which the
correlation coefficient is +1. If a decrease in one of the pairs of a
perfectly correlated series is accompanied by an increase in the
other, the coefficient is -1. If r = 0 there is no tendency to this
linear relationship though there may be a close connection of another
kind.

The correlation coefficient then may be described as a measure of the
approach to linear relationship.
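Both extreme cases mentioned above can be illustrated with a few made-up numbers. In the sketch below the radii and the values of x are arbitrary illustrations, not data from this study, and the function name is hypothetical. The second example pairs each x with its square: the two are completely dependent, yet the coefficient is zero because the relation is not linear.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two paired series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Radii and circumferences: an exact linear relation, so r = +1.
radii = [1.0, 2.0, 3.5, 5.0]
circumferences = [2.0 * math.pi * radius for radius in radii]
r_circles = pearson_r(radii, circumferences)

# y = x^2 on a symmetric set of x values: a close connection, but r = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_parabola = pearson_r(xs, ys)
```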

XII. CALCULATION OF r

The correlation coefficient, r, is defined by the formula

$$r = \frac{\sum xy}{N\,\sigma_x\,\sigma_y}$$

where x and y are the deviations from their respective means,
$\sigma_x$ and $\sigma_y$ are the standard deviations of the two
distributions and N is the total number of observations. In the given
sets of data of heights and weights, y represents height and x
represents weight, an arrangement which is purely arbitrary, for y
might just as well have stood for weight and x for height as far as
results are concerned. However, as it has been arranged, the columns
give the frequencies of each weight interval distributed according to
height, and the rows give the frequencies of each unit of height
distributed according to weight. The total frequency distributions for
the rows and columns give, as is to be expected, the original
frequency distributions (cf Tables II and XIII). If there seems at
times to be a discrepancy, it is due to the fact that in making the
correlation tables, the measurements which occurred on the division
point between intervals were not divided as in the original
distributions, but were given alternately to the class above and to
the class below. Many times the number of such occurrences was not
even and one of the classes received one more observation than was
given by the original tables. The discrepancy of several hundredths
which is seen when the original standard deviations are compared with
those of the correlation tables is due to the same cause.

The method used for finding r is given by Table XIII. In that table,
d represents the deviation from any arbitrary point, taken as in the
summation method given above, so that the least possible arithmetical
work is involved. In the example (19 year men) this arbitrary origin
was taken at height 68 and at weight 135, as was the case in obtaining
the means and standard deviations of the distributions, in order to
compare one with the other. So in finding the standard deviations
which are necessary for the calculation of the correlation
coefficient, the method of obtaining the moments can be used, since as
was said before the formula for the square of the standard deviation
is that of the second moment ($\frac{\sum fx^2}{N}$). Since x is the
deviation from the mean, the $\sum fx$ must necessarily be 0, but
using d, the deviation from an arbitrary point, the $\sum fd$ is not
equal to 0, and in order to get the standard deviation about the mean
the summation $\frac{\sum fd^2}{N}$ must be decreased by the square of
c ($= \frac{\sum fd}{N}$); that is,

$$\sigma^2 = \frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2.$$

The value a, given in the vertical column of the table, is the value
of the frequency, as given in each square in each row, multiplied by
the respective deviation d of the weights (given in the row just below
the table). The quantity ad then gives the product of the x and y
deviations from an arbitrary point weighted by the proper frequency,
and the row and column ad must check. In order to have the
$\frac{\sum xy}{N}$ about the mean, $\frac{\sum ad}{N}$ must be
decreased by the product $c_x c_y$ as shown in the table. Then we have

$$r = \frac{\dfrac{\sum ad}{N} - c_x c_y}{\sigma_x\,\sigma_y}$$

In the example, r = .5 approximately.
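The arbitrary-origin method can be sketched on a miniature correlation table. The frequencies below are invented for illustration only and are not taken from Table XIII; the origins 68 and 135 merely echo the example in the text, and the function names are hypothetical. The sketch computes r once by the shortcut (sums of d, d² and the products, corrected by $c_x$ and $c_y$) and once directly from deviations about the means.

```python
import math

# Hypothetical miniature correlation table: (height, weight) -> frequency.
TABLE = {
    (66, 125): 3, (66, 135): 2,
    (68, 125): 2, (68, 135): 6, (68, 145): 3,
    (70, 135): 2, (70, 145): 4,
}


def r_shortcut(table, height_origin, weight_origin):
    """r from deviations d taken about arbitrary origins."""
    n = sum(table.values())
    sum_dx = sum(f * (w - weight_origin) for (h, w), f in table.items())
    sum_dy = sum(f * (h - height_origin) for (h, w), f in table.items())
    sum_dx2 = sum(f * (w - weight_origin) ** 2 for (h, w), f in table.items())
    sum_dy2 = sum(f * (h - height_origin) ** 2 for (h, w), f in table.items())
    sum_dxdy = sum(f * (w - weight_origin) * (h - height_origin)
                   for (h, w), f in table.items())
    c_x = sum_dx / n
    c_y = sum_dy / n
    sigma_x = math.sqrt(sum_dx2 / n - c_x ** 2)
    sigma_y = math.sqrt(sum_dy2 / n - c_y ** 2)
    return (sum_dxdy / n - c_x * c_y) / (sigma_x * sigma_y)


def r_direct(table):
    """r from deviations about the means, as in the defining formula."""
    n = sum(table.values())
    mean_x = sum(f * w for (h, w), f in table.items()) / n
    mean_y = sum(f * h for (h, w), f in table.items()) / n
    sum_xy = sum(f * (w - mean_x) * (h - mean_y) for (h, w), f in table.items())
    sum_xx = sum(f * (w - mean_x) ** 2 for (h, w), f in table.items())
    sum_yy = sum(f * (h - mean_y) ** 2 for (h, w), f in table.items())
    return sum_xy / math.sqrt(sum_xx * sum_yy)


r_from_origin = r_shortcut(TABLE, 68, 135)
r_from_means = r_direct(TABLE)
```

The two results agree, and any other choice of origin gives the same r; a convenient origin only keeps the deviations small, which mattered when the sums were formed by hand.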

The remaining groups show correlations whose values are near that of
the example given above. The 20 year group is the only one with a
correlation higher than .5; with its coefficient of .516, it seems
significant that the most mature group averages tallest and heaviest,
and also shows the greatest tendency for heaviness and tallness,
lightness and shortness to be associated. In agreement with this we
see that the youngest group shows the least association between
weight and height of the men's age groups, with a coefficient of .425.
The correlation of the weights and heights of the 18 year group is
expressed by a coefficient of .495, while the freshman group has a
slightly lower coefficient of .478. The heights and weights of women
showed the least correlation of all the groups, their correlation
being .411.

Since there is perfect correlation when r = ±1 and no tendency to
linear relationship when r = 0, we conclude that there is a fair
amount of correlation between heights and weights, or that weight is
to some extent dependent upon height.


1. See C. B. Davenport, "Statistical Methods", p 20, 45; also Elderton
Calculation of Correlation Coefficient for
Height and Weight of 991 Freshmen Age 19
XIII. SUMMARY

To sum up this investigation I have made a table to show in a concise
way how much the means and correlation coefficients of the heights and
weights of the various classes of freshmen differ. I have expressed
the means with their probable errors, which give the measure of
unreliability of the values calculated; the probable error fixes a
pair of limits, one lying above and the other below the value
determined, between which the true value is as likely as not to lie.
The figures which follow sum up the various distributions and indicate
the closeness with which anthropometric measurements are fit by the
normal curve and by several other types.


Group               Mean Height (in.)   Mean Weight (lb.)   Correlation Coefficient
Freshman Class        67.699 ± .047       135.94 ± .34             .478
17 year Freshmen      67.826 ± .064       134.75 ± .48             .425
18 year Freshmen      67.736 ± .050       135.44 ± .36             .495
19 year Freshmen      67.859 ± .051       137.51 ± .36             .500
20 year Freshmen      67.985 ± .053       139.01 ± .40             .516
Women                 63.313 ± .048       117.93 ± .38             .411
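The probable errors attached to the means can be reproduced under one assumption, since this section does not state the formula used: the customary rule that the probable error of a mean is $0.6745\,\sigma/\sqrt{N}$. The sketch below applies it to the heights of the 18 year men, using $\sigma$ = 2.3493 and N = 1004 from Table VI; the function name is hypothetical.

```python
import math

# 0.6745 standard deviations on either side of the mean enclose half of a
# normal distribution, which is what "probable error" refers to.
PROBABLE_ERROR_FACTOR = 0.6745


def probable_error_of_mean(sigma, n):
    """Probable error of a mean (assumed rule: 0.6745 * sigma / sqrt(n))."""
    return PROBABLE_ERROR_FACTOR * sigma / math.sqrt(n)


pe_height_18 = probable_error_of_mean(2.3493, 1004)
```

The result rounds to .050, in agreement with the entry 67.736 ± .050 for the 18 year freshmen.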